Fast and accurate mapping of Complete Genomics reads.

Authors

  • Donghyuk Lee
  • Farhad Hormozdiari
  • Hongyi Xin
  • Faraz Hach
  • Onur Mutlu
  • Can Alkan
Abstract

Many recent advances in genomics and the expectations of personalized medicine are made possible by the power of high-throughput sequencing (HTS) to sequence large collections of human genomes. Tens of different sequencing technologies are currently available, and each HTS platform has different strengths and biases. This diversity makes it possible to use different technologies to correct for one another's shortcomings, but it also requires developing different algorithms for each platform because of the differences in data types and error models. The first problem to tackle in analyzing HTS data for resequencing applications is the read mapping stage. Many tools have been developed for the most popular HTS methods, but publicly available, open-source aligners are still lacking for the Complete Genomics (CG) platform. Unfortunately, Burrows-Wheeler based methods are not practical for CG data due to the gapped nature of the reads generated by this platform. Here we provide a sensitive read mapper (sirFAST) for the CG technology, based on the seed-and-extend paradigm, that can quickly map CG reads to a reference genome. We evaluate the performance and accuracy of sirFAST using both simulated and publicly available real data sets, showing high precision and recall rates.
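As a rough illustration of the seed-and-extend paradigm the abstract refers to, the following Python sketch indexes the k-mers of a reference in a hash table, uses exact k-mer matches from the read as seeds, and then verifies each candidate location by counting mismatches. It is not sirFAST's implementation: the function names (`build_index`, `map_read`), the Hamming-distance verification, and the parameters `k` and `max_mismatches` are assumptions chosen for illustration, and the sketch treats a read as one contiguous string rather than modeling the gapped structure of CG reads.

```python
from collections import defaultdict


def build_index(reference, k):
    """Map every k-mer of the reference to the list of positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def map_read(read, reference, index, k, max_mismatches):
    """Return sorted (start, mismatches) pairs for locations that pass verification.

    Hypothetical helper: seeds are the non-overlapping k-mers of the read,
    and the "extend" step is a plain mismatch count over the full read.
    """
    candidates = set()
    # Seed step: look up each non-overlapping k-mer of the read in the index.
    for offset in range(0, len(read) - k + 1, k):
        for pos in index.get(read[offset:offset + k], ()):
            start = pos - offset  # implied start of the whole read
            if 0 <= start <= len(reference) - len(read):
                candidates.add(start)
    # Extend step: verify each candidate against the reference.
    hits = []
    for start in sorted(candidates):
        distance = hamming(read, reference[start:start + len(read)])
        if distance <= max_mismatches:
            hits.append((start, distance))
    return hits


if __name__ == "__main__":
    reference = "ACGTACGTTAGCCGATAGGCTTAACG"
    index = build_index(reference, k=4)
    # Take 12 bases starting at position 8 and substitute one base.
    read = reference[8:13] + "T" + reference[14:20]
    print(map_read(read, reference, index, k=4, max_mismatches=2))
```

Splitting the read into several seeds means a location can still be found when one seed contains an error, as long as at least one other seed matches exactly; the verification step then decides whether the full read fits within the allowed number of mismatches.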

Similar resources

NINJA-OPS: Fast Accurate Marker Gene Alignment Using Concatenated Ribosomes

The explosion of bioinformatics technologies in the form of next generation sequencing (NGS) has facilitated a massive influx of genomics data in the form of short reads. Short read mapping is therefore a fundamental component of next generation sequencing pipelines which routinely match these short reads against reference genomes for contig assembly. However, such techniques have seldom been a...

Concurrent and Accurate Short Read Mapping on Multicore Platforms

In this paper we introduce a novel parallel work-flow-based aligner for fast and accurate mapping of RNA sequences on servers equipped with multicore processors. Our software, named HPG Aligner W1, leverages the speed of the Burrows-Wheeler Transform to map a large number of RNA fragments (reads) rapidly, as well as the accuracy of the Smith-Waterman algorithm to deal with conflictive reads. The...

Enly: Improving Draft Genomes through Reads Recycling

The reconstruction of the complete genome sequence of an organism is an important point for comparative, functional and evolutionary genomics. Nevertheless, overcoming the problems encountered while completing the sequence of an entire genome can still be demanding in terms of time and resources. We have developed Enly, a simple tool based on the iterative mapping of sequence reads at contig ed...

Mapping and Expression Analysis of a Fusarium Head Blight Resistance Gene Candidate Pleiotropic Drug Resistance 5 (PDR5) in Wheat

Fusarium head blight (FHB) caused by Fusarium graminearum is a serious disease of wheat (Triticum aestivum L.), through which grain quality losses are induced by fungal trichothecene mycotoxins such as deoxynivalenol (DON). A class of plasma membrane localized ABC transporter proteins related to the yeast PDR5 (pleiotropic drug resistance 5) efflux pump seems to be responsible for partial resista...

Read Mapping Algorithms for Single Molecule Sequencing Data

Single Molecule Sequencing technologies such as the Heliscope simplify the preparation of DNA for sequencing, while sampling millions of reads in a day. Simultaneously, the technology suffers from a significantly higher error rate, ameliorated by the ability to sample multiple reads from the same location. In this paper we develop novel rapid alignment algorithms for two-pass Single Molecule Se...

Journal:
  • Methods

Volume 79-80

Publication date: 2015